[chore] remove mentions of flashrl from the repo and point to vllm quantization support instead #1855
Code Review
This pull request removes the experimental, patched FlashRL integration in favor of native quantized (FP8) rollouts supported directly through vLLM, using Truncated Importance Sampling (TIS) for off-policy correction. It deletes the FlashRL-specific documentation, examples, environment files, and code paths, and updates the remaining documentation and codebase to reflect the native FP8 rollout workflow. One review comment correctly points out a syntax issue in a bash code block in the new documentation, where comments break a backslash-continued multiline command.
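For readers unfamiliar with the off-policy correction mentioned in the summary, here is a minimal sketch of truncated importance sampling in plain Python. It is not taken from this PR or from the repository: the function name `tis_weights`, its parameters, and the default cap of 2.0 are hypothetical and chosen only for illustration. The idea is that each token's loss is reweighted by the ratio of the trainer's probability to the (quantized) rollout engine's probability, truncated from above at a cap so that a few large ratios cannot dominate the update.

```python
import math


def tis_weights(train_logprobs, rollout_logprobs, cap=2.0):
    """Return per-token truncated importance weights.

    Each weight is min(pi_train / pi_rollout, cap), computed from
    log-probabilities. All names and the default cap are illustrative
    assumptions, not this repository's API.
    """
    if len(train_logprobs) != len(rollout_logprobs):
        raise ValueError("log-probability sequences must have equal length")
    if cap <= 0:
        raise ValueError("cap must be positive")
    weights = []
    for lp_train, lp_rollout in zip(train_logprobs, rollout_logprobs):
        # Probability ratio between the training policy and the rollout policy.
        ratio = math.exp(lp_train - lp_rollout)
        # Truncate from above only; ratios below the cap pass through unchanged.
        weights.append(min(ratio, cap))
    return weights
```

When the trainer and the rollout engine agree exactly, every weight is 1 and the correction has no effect; the cap only matters for tokens where the quantized rollout assigned a much lower probability than the trainer does.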
@erictang000 flashrl cleanup is being done as a part of #1835
SumanthRH left a comment
Can we make this PR just about a new "Training with Quantized Rollouts" doc?
closes #1658